Overview

Dataset statistics

                                  Train data report   Test data report
Number of variables               12                  11
Number of observations            891                 418
Missing cells                     866                 414
Missing cells (%)                 8.1%                9.0%
Duplicate rows                    0                   0
Duplicate rows (%)                0.0%                0.0%
Total size in memory              83.7 KiB            36.1 KiB
Average record size in memory     96.1 B              88.3 B
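
The missing-cell percentages can be checked from the counts in the table. The following is a minimal sketch, assuming the percentage is the number of missing cells divided by the total number of cells (variables times observations); the function name is illustrative and not part of ydata-profiling.

```python
def missing_cells_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Percentage of empty cells among all cells (variables x observations)."""
    return 100.0 * missing_cells / (n_variables * n_observations)


# Counts taken from the dataset statistics table.
train_missing = missing_cells_pct(866, 12, 891)
test_missing = missing_cells_pct(414, 11, 418)

print(f"train: {train_missing:.1f}%  test: {test_missing:.1f}%")
```

Under that assumption both values round to the figures reported for the train and test data.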

Variable types

                                  Train data report   Test data report
Numeric                           5                   5
Categorical                       4                   3
Text                              3                   3

Alerts

Alert type         Train data report                                Test data report
High correlation   Sex is highly overall correlated with Survived   Alert not present in this dataset
High correlation   Survived is highly overall correlated with Sex   Alert not present in this dataset
Missing            Age has 177 (19.9%) missing values               Age has 86 (20.6%) missing values
Missing            Cabin has 687 (77.1%) missing values             Cabin has 327 (78.2%) missing values
Uniform            PassengerId is uniformly distributed             PassengerId is uniformly distributed
Unique             PassengerId has unique values                    PassengerId has unique values
Unique             Name has unique values                           Name has unique values
Zeros              SibSp has 608 (68.2%) zeros                      SibSp has 283 (67.7%) zeros
Zeros              Parch has 678 (76.1%) zeros                      Parch has 324 (77.5%) zeros
Zeros              Fare has 15 (1.7%) zeros                         Alert not present in this dataset
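
The counts and percentages in the Missing and Zeros alerts are related by simple division by the number of observations. A small sketch, assuming that relationship; `count_alert` is a hypothetical helper written for illustration, not the library's own alert code.

```python
def count_alert(column: str, count: int, n_rows: int, what: str) -> str:
    """Format a count-based alert message with its share of all rows."""
    pct = 100.0 * count / n_rows
    return f"{column} has {count} ({pct:.1f}%) {what}"


# Train data has 891 observations, test data has 418.
age_train = count_alert("Age", 177, 891, "missing values")
cabin_test = count_alert("Cabin", 327, 418, "missing values")
fare_train = count_alert("Fare", 15, 891, "zeros")

print(age_train)
print(cabin_test)
print(fare_train)
```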

Reproduction

                         Train data report            Test data report
Analysis started         2025-02-27 19:42:12.830895   2025-02-27 19:42:17.456738
Analysis finished        2025-02-27 19:42:14.673000   2025-02-27 19:42:19.513129
Duration                 1.84 seconds                 2.06 seconds
Software version         ydata-profiling v4.12.2      ydata-profiling v4.12.2
Download configuration   config.json                  config.json
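
The durations are the differences between the start and finish timestamps. A minimal sketch using only the Python standard library; the variable and function names are illustrative.

```python
from datetime import datetime


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed seconds between two 'YYYY-MM-DD HH:MM:SS.ffffff' timestamps."""
    start = datetime.fromisoformat(started)
    end = datetime.fromisoformat(finished)
    return (end - start).total_seconds()


train_duration = duration_seconds("2025-02-27 19:42:12.830895", "2025-02-27 19:42:14.673000")
test_duration = duration_seconds("2025-02-27 19:42:17.456738", "2025-02-27 19:42:19.513129")

print(f"train: {train_duration:.2f} s  test: {test_duration:.2f} s")
```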

Variables

PassengerId
Real number (ℝ)

                                  Train data report   Test data report
Distinct                          891                 418
Distinct (%)                      100.0%              100.0%
Missing                           0                   0
Missing (%)                       0.0%                0.0%
Infinite                          0                   0
Infinite (%)                      0.0%                0.0%
Mean                              446                 1100.5
Minimum                           1                   892
Maximum                           891                 1309
Zeros                             0                   0
Zeros (%)                         0.0%                0.0%
Negative                          0                   0
Negative (%)                      0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Quantile statistics

                                  Train data report   Test data report
Minimum                           1                   892
5-th percentile                   45.5                912.85
Q1                                223.5               996.25
median                            446                 1100.5
Q3                                668.5               1204.75
95-th percentile                  846.5               1288.15
Maximum                           891                 1309
Range                             890                 417
Interquartile range (IQR)         445                 208.5

Descriptive statistics

                                  Train data report   Test data report
Standard deviation                257.35384           120.81046
Coefficient of variation (CV)     0.57702655          0.10977779
Kurtosis                          -1.2                -1.2
Mean                              446                 1100.5
Median Absolute Deviation (MAD)   223                 104.5
Skewness                          0                   0
Sum                               397386              460009
Variance                          66231               14595.167
Monotonicity                      Strictly increasing Strictly increasing

Histogram with fixed size bins (bins=50)
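
Because PassengerId is unique and strictly increasing from 1 to 891 in the train data, its statistics can be reproduced from the consecutive integers alone. The sketch below assumes the IDs are exactly 1..891, that the variance is the sample variance, and that quartiles use linear interpolation (the `inclusive` method of `statistics.quantiles`); these assumptions are inferred from the reported numbers, not taken from the library's documentation.

```python
import statistics

# Assumed train IDs: the consecutive integers 1..891, as floats.
ids = [float(i) for i in range(1, 892)]

mean = statistics.mean(ids)
variance = statistics.variance(ids)      # sample variance (n - 1 denominator)
stdev = statistics.stdev(ids)
cv = stdev / mean                        # coefficient of variation
q1, median, q3 = statistics.quantiles(ids, n=4, method="inclusive")

print(mean, variance, round(stdev, 5), round(cv, 8))
print(q1, median, q3, q3 - q1)
```

Under these assumptions the results agree with the train column: the coefficient of variation is the standard deviation divided by the mean, and the interquartile range is Q3 minus Q1.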
Common values

Train data report

Value                   Count   Frequency (%)
1                       1       0.1%
599                     1       0.1%
588                     1       0.1%
589                     1       0.1%
590                     1       0.1%
591                     1       0.1%
592                     1       0.1%
593                     1       0.1%
594                     1       0.1%
595                     1       0.1%
Other values (881)      881     98.9%

Test data report

Value                   Count   Frequency (%)
892                     1       0.2%
1205                    1       0.2%
1177                    1       0.2%
1176                    1       0.2%
1175                    1       0.2%
1174                    1       0.2%
1173                    1       0.2%
1172                    1       0.2%
1171                    1       0.2%
1170                    1       0.2%
Other values (408)      408     97.6%
Minimum values

Train data report

Value                   Count   Frequency (%)
1                       1       0.1%
2                       1       0.1%
3                       1       0.1%
4                       1       0.1%
5                       1       0.1%
6                       1       0.1%
7                       1       0.1%
8                       1       0.1%
9                       1       0.1%
10                      1       0.1%

Test data report

Value                   Count   Frequency (%)
892                     1       0.2%
893                     1       0.2%
894                     1       0.2%
895                     1       0.2%
896                     1       0.2%
897                     1       0.2%
898                     1       0.2%
899                     1       0.2%
900                     1       0.2%
901                     1       0.2%

Survived
Categorical

Distinct                          2
Distinct (%)                      0.2%
Missing                           0
Missing (%)                       0.0%
Memory size                       7.1 KiB

Value                   Count
0                       549
1                       342

Length

Max length                        1
Median length                     1
Mean length                       1
Min length                        1

Characters and Unicode

Total characters                  891
Distinct characters               2
Distinct categories               1
Distinct scripts                  1
Distinct blocks                   1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
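
As an illustration of what a character property is, the sketch below groups the characters of a string by Unicode general category using Python's standard `unicodedata` module. It is not how the report computes its values (the report lists the category, script and block as "(unknown)"); the standard library also exposes only the general category, not script or block. The function name is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lu', 'Ll', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Survived values are the digits 0 and 1, both decimal numbers ('Nd').
survived_categories = category_counts("01")

# A name from the Sample table mixes letters, punctuation and spaces.
name_categories = category_counts("Braund, Mr. Owen Harris")

print(survived_categories)
print(name_categories)
```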

Unique

Unique                            0
Unique (%)                        0.0%

Sample

1st row                           0
2nd row                           1
3rd row                           1
4th row                           1
5th row                           0

Common Values

Value                   Count   Frequency (%)
0                       549     61.6%
1                       342     38.4%

Length

Histogram of lengths of the category

Most occurring characters

Value                   Count   Frequency (%)
0                       549     61.6%
1                       342     38.4%

Most occurring categories, scripts and blocks

Grouping     Value       Count   Frequency (%)
Categories   (unknown)   891     100.0%
Scripts      (unknown)   891     100.0%
Blocks       (unknown)   891     100.0%

Most frequent character per category, script and block

Each grouping contains the single value (unknown), so all three per-group tables repeat the "Most occurring characters" table.

Pclass
Categorical

                                  Train data report   Test data report
Distinct                          3                   3
Distinct (%)                      0.3%                0.7%
Missing                           0                   0
Missing (%)                       0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Value                             Train count         Test count
3                                 491                 218
1                                 216                 107
2                                 184                 93

Length

                                  Train data report   Test data report
Max length                        1                   1
Median length                     1                   1
Mean length                       1                   1
Min length                        1                   1

Characters and Unicode

                                  Train data report   Test data report
Total characters                  891                 418
Distinct characters               3                   3
Distinct categories               1                   1
Distinct scripts                  1                   1
Distinct blocks                   1                   1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

                                  Train data report   Test data report
Unique                            0                   0
Unique (%)                        0.0%                0.0%

Sample

                                  Train data report   Test data report
1st row                           3                   3
2nd row                           1                   3
3rd row                           3                   2
4th row                           1                   3
5th row                           3                   3

Common Values

Train data report

Value                   Count   Frequency (%)
3                       491     55.1%
1                       216     24.2%
2                       184     20.7%

Test data report

Value                   Count   Frequency (%)
3                       218     52.2%
1                       107     25.6%
2                       93      22.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: Train data report]
[Plot: Test data report]

Most occurring characters

Train data report

Value                   Count   Frequency (%)
3                       491     55.1%
1                       216     24.2%
2                       184     20.7%

Test data report

Value                   Count   Frequency (%)
3                       218     52.2%
1                       107     25.6%
2                       93      22.2%

Most occurring categories, scripts and blocks

Grouping     Value       Train count (%)   Test count (%)
Categories   (unknown)   891 (100.0%)      418 (100.0%)
Scripts      (unknown)   891 (100.0%)      418 (100.0%)
Blocks       (unknown)   891 (100.0%)      418 (100.0%)

Most frequent character per category, script and block

Each grouping contains the single value (unknown), so all three per-group tables repeat the "Most occurring characters" tables.

Name
Text

                                  Train data report   Test data report
Distinct                          891                 418
Distinct (%)                      100.0%              100.0%
Missing                           0                   0
Missing (%)                       0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Length

                                  Train data report   Test data report
Max length                        82                  63
Median length                     52                  51
Mean length                       26.965208           27.483254
Min length                        12                  13

Characters and Unicode

                                  Train data report   Test data report
Total characters                  24026               11488
Distinct characters               60                  58
Distinct categories               1                   1
Distinct scripts                  1                   1
Distinct blocks                   1                   1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

                                  Train data report   Test data report
Unique                            891                 418
Unique (%)                        100.0%              100.0%

Sample

Train data report
1st row   Braund, Mr. Owen Harris
2nd row   Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row   Heikkinen, Miss. Laina
4th row   Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row   Allen, Mr. William Henry

Test data report
1st row   Kelly, Mr. James
2nd row   Wilkes, Mrs. James (Ellen Needs)
3rd row   Myles, Mr. Thomas Francis
4th row   Wirz, Mr. Albert
5th row   Hirvonen, Mrs. Alexander (Helga E Lindqvist)

Words

Train data report

Value                   Count   Frequency (%)
mr                      521     14.4%
miss                    182     5.0%
mrs                     129     3.6%
william                 64      1.8%
john                    44      1.2%
master                  40      1.1%
henry                   35      1.0%
george                  24      0.7%
james                   24      0.7%
charles                 23      0.6%
Other values (1515)     2538    70.0%

Test data report

Value                   Count   Frequency (%)
mr                      242     14.0%
miss                    78      4.5%
mrs                     72      4.2%
john                    28      1.6%
william                 23      1.3%
master                  21      1.2%
charles                 16      0.9%
joseph                  15      0.9%
james                   14      0.8%
henry                   14      0.8%
Other values (825)      1202    69.7%

Most occurring characters

Train data report

Value                   Count   Frequency (%)
(space)                 2735    11.4%
r                       1958    8.1%
e                       1703    7.1%
a                       1657    6.9%
i                       1325    5.5%
n                       1304    5.4%
s                       1297    5.4%
M                       1128    4.7%
l                       1067    4.4%
o                       1008    4.2%
Other values (50)       8844    36.8%

Test data report

Value                   Count   Frequency (%)
(space)                 1309    11.4%
r                       971     8.5%
e                       822     7.2%
a                       786     6.8%
s                       628     5.5%
i                       621     5.4%
n                       596     5.2%
l                       526     4.6%
M                       515     4.5%
o                       467     4.1%
Other values (48)       4247    37.0%

Most occurring categories, scripts and blocks

Grouping     Value       Train count (%)   Test count (%)
Categories   (unknown)   24026 (100.0%)    11488 (100.0%)
Scripts      (unknown)   24026 (100.0%)    11488 (100.0%)
Blocks       (unknown)   24026 (100.0%)    11488 (100.0%)

Most frequent character per category, script and block

Each grouping contains the single value (unknown), so all three per-group tables repeat the "Most occurring characters" tables.

Sex
Categorical

                                  Train data report   Test data report
Distinct                          2                   2
Distinct (%)                      0.2%                0.5%
Missing                           0                   0
Missing (%)                       0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Value                             Train count         Test count
male                              577                 266
female                            314                 152

Length

                                  Train data report   Test data report
Max length                        6                   6
Median length                     4                   4
Mean length                       4.704826            4.7272727
Min length                        4                   4

Characters and Unicode

                                  Train data report   Test data report
Total characters                  4192                1976
Distinct characters               5                   5
Distinct categories               1                   1
Distinct scripts                  1                   1
Distinct blocks                   1                   1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

                                  Train data report   Test data report
Unique                            0                   0
Unique (%)                        0.0%                0.0%

Sample

                                  Train data report   Test data report
1st row                           male                male
2nd row                           female              female
3rd row                           female              male
4th row                           female              male
5th row                           male                female
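
The length and character totals for Sex follow from the category counts, since "male" has 4 characters and "female" has 6. A minimal sketch under that reading; `length_stats` is an illustrative helper, not a library function.

```python
def length_stats(counts: dict[str, int]) -> tuple[int, float]:
    """Return (total characters, mean length) for a value -> count mapping."""
    total_chars = sum(len(value) * count for value, count in counts.items())
    n_values = sum(counts.values())
    return total_chars, total_chars / n_values


train_total, train_mean = length_stats({"male": 577, "female": 314})
test_total, test_mean = length_stats({"male": 266, "female": 152})

print(train_total, round(train_mean, 6))
print(test_total, round(test_mean, 7))
```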

Common Values

Train data report

Value                   Count   Frequency (%)
male                    577     64.8%
female                  314     35.2%

Test data report

Value                   Count   Frequency (%)
male                    266     63.6%
female                  152     36.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: Train data report]
[Plot: Test data report]

Most occurring characters

Train data report

Value                   Count   Frequency (%)
e                       1205    28.7%
m                       891     21.3%
a                       891     21.3%
l                       891     21.3%
f                       314     7.5%

Test data report

Value                   Count   Frequency (%)
e                       570     28.8%
m                       418     21.2%
a                       418     21.2%
l                       418     21.2%
f                       152     7.7%

Most occurring categories, scripts and blocks

Grouping     Value       Train count (%)   Test count (%)
Categories   (unknown)   4192 (100.0%)     1976 (100.0%)
Scripts      (unknown)   4192 (100.0%)     1976 (100.0%)
Blocks       (unknown)   4192 (100.0%)     1976 (100.0%)

Most frequent character per category, script and block

Each grouping contains the single value (unknown), so all three per-group tables repeat the "Most occurring characters" tables.

Age
Real number (ℝ)

                                  Train data report   Test data report
Distinct                          88                  79
Distinct (%)                      12.3%               23.8%
Missing                           177                 86
Missing (%)                       19.9%               20.6%
Infinite                          0                   0
Infinite (%)                      0.0%                0.0%
Mean                              29.699118           30.27259
Minimum                           0.42                0.17
Maximum                           80                  76
Zeros                             0                   0
Zeros (%)                         0.0%                0.0%
Negative                          0                   0
Negative (%)                      0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Quantile statistics

                                  Train data report   Test data report
Minimum                           0.42                0.17
5-th percentile                   4                   8
Q1                                20.125              21
median                            28                  27
Q3                                38                  39
95-th percentile                  56                  57
Maximum                           80                  76
Range                             79.58               75.83
Interquartile range (IQR)         17.875              18

Descriptive statistics

                                  Train data report   Test data report
Standard deviation                14.526497           14.181209
Coefficient of variation (CV)     0.48912219          0.46845047
Kurtosis                          0.17827415          0.083783352
Mean                              29.699118           30.27259
Median Absolute Deviation (MAD)   9                   8
Skewness                          0.38910778          0.45736129
Sum                               21205.17            10050.5
Variance                          211.01912           201.1067
Monotonicity                      Not monotonic       Not monotonic

Histogram with fixed size bins (bins=50)
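
The Age figures suggest that missing values are excluded before the statistics are computed: the distinct percentage and the mean only match the report when divided by the non-missing count. The sketch below checks this; the reading is inferred from the numbers above rather than from the tool's documentation, and the helper name is illustrative.

```python
def non_missing_stats(n_rows: int, n_missing: int, n_distinct: int, total: float) -> dict:
    """Distinct percentage and mean computed over non-missing rows only."""
    n_valid = n_rows - n_missing
    return {
        "n_valid": n_valid,
        "distinct_pct": round(100.0 * n_distinct / n_valid, 1),
        "mean": total / n_valid,
    }


train_age = non_missing_stats(n_rows=891, n_missing=177, n_distinct=88, total=21205.17)
test_age = non_missing_stats(n_rows=418, n_missing=86, n_distinct=79, total=10050.5)

print(train_age)
print(test_age)
```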
Common values

Train data report

Value                   Count   Frequency (%)
24                      30      3.4%
22                      27      3.0%
18                      26      2.9%
28                      25      2.8%
30                      25      2.8%
19                      25      2.8%
21                      24      2.7%
25                      23      2.6%
36                      22      2.5%
29                      20      2.2%
Other values (78)       467     52.4%
(Missing)               177     19.9%

Test data report

Value                   Count   Frequency (%)
24                      17      4.1%
21                      17      4.1%
22                      16      3.8%
30                      15      3.6%
18                      13      3.1%
27                      12      2.9%
26                      12      2.9%
23                      11      2.6%
25                      11      2.6%
29                      10      2.4%
Other values (69)       198     47.4%
(Missing)               86      20.6%
Minimum values

Train data report

Value                   Count   Frequency (%)
0.42                    1       0.1%
0.67                    1       0.1%
0.75                    2       0.2%
0.83                    2       0.2%
0.92                    1       0.1%
1                       7       0.8%
2                       10      1.1%
3                       6       0.7%
4                       10      1.1%
5                       4       0.4%

Test data report

Value                   Count   Frequency (%)
0.17                    1       0.2%
0.33                    1       0.2%
0.75                    1       0.2%
0.83                    1       0.2%
0.92                    1       0.2%
1                       3       0.7%
2                       2       0.5%
3                       1       0.2%
5                       1       0.2%
6                       3       0.7%

SibSp
Real number (ℝ)

                                  Train data report   Test data report
Distinct                          7                   7
Distinct (%)                      0.8%                1.7%
Missing                           0                   0
Missing (%)                       0.0%                0.0%
Infinite                          0                   0
Infinite (%)                      0.0%                0.0%
Mean                              0.52300786          0.44736842
Minimum                           0                   0
Maximum                           8                   8
Zeros                             608                 283
Zeros (%)                         68.2%               67.7%
Negative                          0                   0
Negative (%)                      0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Quantile statistics

                                  Train data report   Test data report
Minimum                           0                   0
5-th percentile                   0                   0
Q1                                0                   0
median                            0                   0
Q3                                1                   1
95-th percentile                  3                   2
Maximum                           8                   8
Range                             8                   8
Interquartile range (IQR)         1                   1

Descriptive statistics

                                  Train data report   Test data report
Standard deviation                1.1027434           0.89675956
Coefficient of variation (CV)     2.1084644           2.0045214
Kurtosis                          17.88042            26.498712
Mean                              0.52300786          0.44736842
Median Absolute Deviation (MAD)   0                   0
Skewness                          3.6953517           4.1683366
Sum                               466                 187
Variance                          1.2160431           0.80417771
Monotonicity                      Not monotonic       Not monotonic

Histogram with fixed size bins (bins=7)
Common values

Train data report

Value                   Count   Frequency (%)
0                       608     68.2%
1                       209     23.5%
2                       28      3.1%
4                       18      2.0%
3                       16      1.8%
8                       7       0.8%
5                       5       0.6%

Test data report

Value                   Count   Frequency (%)
0                       283     67.7%
1                       110     26.3%
2                       14      3.3%
3                       4       1.0%
4                       4       1.0%
8                       2       0.5%
5                       1       0.2%
Minimum values

Train data report

Value                   Count   Frequency (%)
0                       608     68.2%
1                       209     23.5%
2                       28      3.1%
3                       16      1.8%
4                       18      2.0%
5                       5       0.6%
8                       7       0.8%

Test data report

Value                   Count   Frequency (%)
0                       283     67.7%
1                       110     26.3%
2                       14      3.3%
3                       4       1.0%
4                       4       1.0%
5                       1       0.2%
8                       2       0.5%

Parch
Real number (ℝ)

                                  Train data report   Test data report
Distinct                          7                   8
Distinct (%)                      0.8%                1.9%
Missing                           0                   0
Missing (%)                       0.0%                0.0%
Infinite                          0                   0
Infinite (%)                      0.0%                0.0%
Mean                              0.38159371          0.3923445
Minimum                           0                   0
Maximum                           6                   9
Zeros                             678                 324
Zeros (%)                         76.1%               77.5%
Negative                          0                   0
Negative (%)                      0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Quantile statistics

                                  Train data report   Test data report
Minimum                           0                   0
5-th percentile                   0                   0
Q1                                0                   0
median                            0                   0
Q3                                0                   0
95-th percentile                  2                   2
Maximum                           6                   9
Range                             6                   9
Interquartile range (IQR)         0                   0

Descriptive statistics

                                  Train data report   Test data report
Standard deviation                0.80605722          0.98142888
Coefficient of variation (CV)     2.1123441           2.5014468
Kurtosis                          9.7781252           31.412513
Mean                              0.38159371          0.3923445
Median Absolute Deviation (MAD)   0                   0
Skewness                          2.749117            4.6544617
Sum                               340                 164
Variance                          0.64972824          0.96320264
Monotonicity                      Not monotonic       Not monotonic

Histogram with fixed size bins (bins=7)
Common values

Train data report

Value                   Count   Frequency (%)
0                       678     76.1%
1                       118     13.2%
2                       80      9.0%
5                       5       0.6%
3                       5       0.6%
4                       4       0.4%
6                       1       0.1%

Test data report

Value                   Count   Frequency (%)
0                       324     77.5%
1                       52      12.4%
2                       33      7.9%
3                       3       0.7%
4                       2       0.5%
9                       2       0.5%
6                       1       0.2%
5                       1       0.2%
Minimum values

Train data report

Value                   Count   Frequency (%)
0                       678     76.1%
1                       118     13.2%
2                       80      9.0%
3                       5       0.6%
4                       4       0.4%
5                       5       0.6%
6                       1       0.1%

Test data report

Value                   Count   Frequency (%)
0                       324     77.5%
1                       52      12.4%
2                       33      7.9%
3                       3       0.7%
4                       2       0.5%
5                       1       0.2%
6                       1       0.2%
9                       2       0.5%

Ticket
Text

                                  Train data report   Test data report
Distinct                          681                 363
Distinct (%)                      76.4%               86.8%
Missing                           0                   0
Missing (%)                       0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Length

                                  Train data report   Test data report
Max length                        18                  18
Median length                     17                  17
Mean length                       6.7508418           6.8755981
Min length                        3                   3

Characters and Unicode

                                  Train data report   Test data report
Total characters                  6015                2874
Distinct characters               35                  32
Distinct categories               1                   1
Distinct scripts                  1                   1
Distinct blocks                   1                   1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

                                  Train data report   Test data report
Unique                            547                 321
Unique (%)                        61.4%               76.8%

Sample

                                  Train data report   Test data report
1st row                           A/5 21171           330911
2nd row                           PC 17599            363272
3rd row                           STON/O2. 3101282    240276
4th row                           113803              315154
5th row                           373450              3101298

Words

Train data report

Value                   Count   Frequency (%)
pc                      60      5.3%
c.a                     27      2.4%
a/5                     17      1.5%
ca                      14      1.2%
ston/o                  12      1.1%
2                       12      1.1%
sc/paris                9       0.8%
w./c                    9       0.8%
soton/o.q               8       0.7%
347082                  7       0.6%
Other values (709)      955     84.5%

Test data report

Value                   Count   Frequency (%)
pc                      32      5.9%
c.a                     19      3.5%
ca                      8       1.5%
soton/o.q               8       1.5%
sc/paris                7       1.3%
17608                   5       0.9%
2                       5       0.9%
a/5                     5       0.9%
w./c                    5       0.9%
f.c.c                   4       0.7%
Other values (383)      445     82.0%

Most occurring characters

Train data report

Value                   Count   Frequency (%)
3                       746     12.4%
1                       689     11.5%
2                       594     9.9%
7                       490     8.1%
4                       464     7.7%
6                       422     7.0%
0                       406     6.7%
5                       387     6.4%
9                       328     5.5%
8                       282     4.7%
Other values (25)       1207    20.1%

Test data report

Value                   Count   Frequency (%)
3                       364     12.7%
1                       311     10.8%
2                       268     9.3%
7                       207     7.2%
6                       206     7.2%
0                       204     7.1%
5                       195     6.8%
4                       188     6.5%
8                       144     5.0%
9                       137     4.8%
Other values (22)       650     22.6%

Most occurring categories, scripts and blocks

Grouping     Value       Train count (%)   Test count (%)
Categories   (unknown)   6015 (100.0%)     2874 (100.0%)
Scripts      (unknown)   6015 (100.0%)     2874 (100.0%)
Blocks       (unknown)   6015 (100.0%)     2874 (100.0%)

Most frequent character per category, script and block

Each grouping contains the single value (unknown), so all three per-group tables repeat the "Most occurring characters" tables.

Fare
Real number (ℝ)

                                  Train data report   Test data report
Distinct                          248                 169
Distinct (%)                      27.8%               40.5%
Missing                           0                   1
Missing (%)                       0.0%                0.2%
Infinite                          0                   0
Infinite (%)                      0.0%                0.0%
Mean                              32.204208           35.627188
Minimum                           0                   0
Maximum                           512.3292            512.3292
Zeros                             15                  2
Zeros (%)                         1.7%                0.5%
Negative                          0                   0
Negative (%)                      0.0%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Quantile statistics

                                  Train data report   Test data report
Minimum                           0                   0
5-th percentile                   7.225               7.2292
Q1                                7.9104              7.8958
median                            14.4542             14.4542
Q3                                31                  31.5
95-th percentile                  112.07915           151.55
Maximum                           512.3292            512.3292
Range                             512.3292            512.3292
Interquartile range (IQR)         23.0896             23.6042

Descriptive statistics

                                  Train data report   Test data report
Standard deviation                49.693429           55.907576
Coefficient of variation (CV)     1.5430725           1.5692391
Kurtosis                          33.398141           17.921595
Mean                              32.204208           35.627188
Median Absolute Deviation (MAD)   6.9042              6.825
Skewness                          4.7873165           3.6872133
Sum                               28693.949           14856.538
Variance                          2469.4368           3125.6571
Monotonicity                      Not monotonic       Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Train data report

Value                   Count   Frequency (%)
8.05                    43      4.8%
13                      42      4.7%
7.8958                  38      4.3%
7.75                    34      3.8%
26                      31      3.5%
10.5                    24      2.7%
7.925                   18      2.0%
7.775                   16      1.8%
7.2292                  15      1.7%
0                       15      1.7%
Other values (238)      615     69.0%

Test data report

Value                   Count   Frequency (%)
7.75                    21      5.0%
26                      19      4.5%
8.05                    17      4.1%
13                      17      4.1%
10.5                    11      2.6%
7.8958                  11      2.6%
7.775                   10      2.4%
7.2292                  9       2.2%
7.225                   9       2.2%
7.8542                  8       1.9%
Other values (159)      285     68.2%
Minimum values

Train data report

Value                   Count   Frequency (%)
0                       15      1.7%
4.0125                  1       0.1%
5                       1       0.1%
6.2375                  1       0.1%
6.4375                  1       0.1%
6.45                    1       0.1%
6.4958                  2       0.2%
6.75                    2       0.2%
6.8583                  1       0.1%
6.95                    1       0.1%

Test data report

Value                   Count   Frequency (%)
0                       2       0.5%
3.1708                  1       0.2%
6.4375                  2       0.5%
6.4958                  1       0.2%
6.95                    1       0.2%
7                       2       0.5%
7.05                    2       0.5%
7.225                   9       2.2%
7.2292                  9       2.2%
7.25                    5       1.2%

Cabin
Text

                                  Train data report   Test data report
Distinct                          147                 76
Distinct (%)                      72.1%               83.5%
Missing                           687                 327
Missing (%)                       77.1%               78.2%
Memory size                       7.1 KiB             3.4 KiB

Length

                                  Train data report   Test data report
Max length                        15                  15
Median length                     3                   3
Mean length                       3.5882353           4.0769231
Min length                        1                   1

Characters and Unicode

                                  Train data report   Test data report
Total characters                  732                 371
Distinct characters               19                  18
Distinct categories               1                   1
Distinct scripts                  1                   1
Distinct blocks                   1                   1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

                                  Train data report   Test data report
Unique                            101                 62
Unique (%)                        49.5%               68.1%

Sample

                                  Train data report   Test data report
1st row                           C85                 B45
2nd row                           C123                E31
3rd row                           E46                 B57 B59 B63 B66
4th row                           G6                  B36
5th row                           C103                A21

Words

Train data report

Value                   Count   Frequency (%)
c23                     4       1.7%
c27                     4       1.7%
g6                      4       1.7%
b96                     4       1.7%
b98                     4       1.7%
f                       4       1.7%
c25                     4       1.7%
f33                     3       1.3%
e101                    3       1.3%
f2                      3       1.3%
Other values (151)      201     84.5%

Test data report

Value                   Count   Frequency (%)
f                       4       3.4%
b57                     3       2.5%
b63                     3       2.5%
b66                     3       2.5%
b59                     3       2.5%
c27                     2       1.7%
e46                     2       1.7%
c6                      2       1.7%
c78                     2       1.7%
b45                     2       1.7%
Other values (80)       92      78.0%

Most occurring characters

Train data report

Value                   Count   Frequency (%)
2                       72      9.8%
C                       71      9.7%
B                       64      8.7%
1                       61      8.3%
3                       59      8.1%
6                       51      7.0%
5                       45      6.1%
4                       37      5.1%
8                       37      5.1%
(space)                 34      4.6%
Other values (9)        201     27.5%

Test data report

Value                   Count   Frequency (%)
C                       43      11.6%
5                       34      9.2%
1                       33      8.9%
B                       32      8.6%
6                       30      8.1%
3                       28      7.5%
(space)                 27      7.3%
2                       25      6.7%
4                       21      5.7%
7                       15      4.0%
Other values (8)        83      22.4%

Most occurring categories, scripts and blocks

Grouping     Value       Train count (%)   Test count (%)
Categories   (unknown)   732 (100.0%)      371 (100.0%)
Scripts      (unknown)   732 (100.0%)      371 (100.0%)
Blocks       (unknown)   732 (100.0%)      371 (100.0%)

Most frequent character per category, script and block

Each grouping contains the single value (unknown), so all three per-group tables repeat the "Most occurring characters" tables.

Embarked
Categorical

                                  Train data report   Test data report
Distinct                          3                   3
Distinct (%)                      0.3%                0.7%
Missing                           2                   0
Missing (%)                       0.2%                0.0%
Memory size                       7.1 KiB             3.4 KiB

Value                             Train count         Test count
S                                 644                 270
C                                 168                 102
Q                                 77                  46

Length

                                  Train data report   Test data report
Max length                        1                   1
Median length                     1                   1
Mean length                       1                   1
Min length                        1                   1

Characters and Unicode

                                  Train data report   Test data report
Total characters                  889                 418
Distinct characters               3                   3
Distinct categories               1                   1
Distinct scripts                  1                   1
Distinct blocks                   1                   1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

                                  Train data report   Test data report
Unique                            0                   0
Unique (%)                        0.0%                0.0%

Sample

                                  Train data report   Test data report
1st row                           S                   Q
2nd row                           C                   S
3rd row                           S                   Q
4th row                           S                   S
5th row                           S                   S

Common Values

Train data report

Value                   Count   Frequency (%)
S                       644     72.3%
C                       168     18.9%
Q                       77      8.6%
(Missing)               2       0.2%

Test data report

Value                   Count   Frequency (%)
S                       270     64.6%
C                       102     24.4%
Q                       46      11.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: Train data report]
[Plot: Test data report]

Words

Train data report

Value                   Count   Frequency (%)
s                       644     72.4%
c                       168     18.9%
q                       77      8.7%

Test data report

Value                   Count   Frequency (%)
s                       270     64.6%
c                       102     24.4%
q                       46      11.0%
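
The train percentages for Embarked differ slightly between tables (72.3% and 8.6% in Common Values, 72.4% and 8.7% elsewhere). One reading that reproduces both sets of figures is that Common Values divides by all 891 rows, while the other tables divide by the 889 non-missing values. The sketch below checks that reading; it is an inference from the numbers, and the helper name is illustrative.

```python
def pct(count: int, total: int) -> float:
    """Share of total as a percentage, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


counts = {"S": 644, "C": 168, "Q": 77}   # train counts from the report
n_rows, n_missing = 891, 2

with_missing = {value: pct(count, n_rows) for value, count in counts.items()}
without_missing = {value: pct(count, n_rows - n_missing) for value, count in counts.items()}

print(with_missing)
print(without_missing)
```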

Most occurring characters

Train data report

Value                   Count   Frequency (%)
S                       644     72.4%
C                       168     18.9%
Q                       77      8.7%

Test data report

Value                   Count   Frequency (%)
S                       270     64.6%
C                       102     24.4%
Q                       46      11.0%

Most occurring categories, scripts and blocks

Grouping     Value       Train count (%)   Test count (%)
Categories   (unknown)   889 (100.0%)      418 (100.0%)
Scripts      (unknown)   889 (100.0%)      418 (100.0%)
Blocks       (unknown)   889 (100.0%)      418 (100.0%)

Most frequent character per category, script and block

Each grouping contains the single value (unknown), so all three per-group tables repeat the "Most occurring characters" tables.

Interactions

[Interaction plots for the train and test data reports]

Test data report

2025-02-27T13:42:19.208300image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Train data report

2025-02-27T13:42:13.353349image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Test data report

2025-02-27T13:42:18.088855image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Train data report

2025-02-27T13:42:13.688864image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Test data report

2025-02-27T13:42:18.338650image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Train data report

2025-02-27T13:42:13.927479image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Test data report

2025-02-27T13:42:18.629921image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Train data report

2025-02-27T13:42:14.129378image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Test data report

2025-02-27T13:42:18.927900image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/

Correlations

[Correlation heatmaps for the train and test data reports: Matplotlib v3.8.4 SVG images, not recoverable from this text.]

Train data report

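Variable | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived
Age | 1.000 | 0.065 | 0.135 | -0.254 | 0.041 | 0.269 | 0.099 | -0.182 | 0.155
Embarked | 0.065 | 1.000 | 0.196 | 0.052 | 0.000 | 0.260 | 0.113 | 0.092 | 0.166
Fare | 0.135 | 0.196 | 1.000 | 0.410 | -0.014 | 0.479 | 0.189 | 0.447 | 0.283
Parch | -0.254 | 0.052 | 0.410 | 1.000 | 0.001 | 0.022 | 0.247 | 0.450 | 0.157
PassengerId | 0.041 | 0.000 | -0.014 | 0.001 | 1.000 | 0.032 | 0.066 | -0.061 | 0.104
Pclass | 0.269 | 0.260 | 0.479 | 0.022 | 0.032 | 1.000 | 0.130 | 0.148 | 0.337
Sex | 0.099 | 0.113 | 0.189 | 0.247 | 0.066 | 0.130 | 1.000 | 0.206 | 0.540
SibSp | -0.182 | 0.092 | 0.447 | 0.450 | -0.061 | 0.148 | 0.206 | 1.000 | 0.187
Survived | 0.155 | 0.166 | 0.283 | 0.157 | 0.104 | 0.337 | 0.540 | 0.187 | 1.000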

Test data report

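Variable | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp
Age | 1.000 | 0.135 | 0.315 | -0.130 | -0.019 | 0.349 | 0.000 | -0.015
Embarked | 0.135 | 1.000 | 0.240 | 0.113 | 0.060 | 0.308 | 0.109 | 0.101
Fare | 0.315 | 0.240 | 1.000 | 0.378 | 0.020 | 0.475 | 0.154 | 0.441
Parch | -0.130 | 0.113 | 0.378 | 1.000 | 0.051 | 0.000 | 0.213 | 0.412
PassengerId | -0.019 | 0.060 | 0.020 | 0.051 | 1.000 | 0.054 | 0.000 | -0.010
Pclass | 0.349 | 0.308 | 0.475 | 0.000 | 0.054 | 1.000 | 0.106 | 0.113
Sex | 0.000 | 0.109 | 0.154 | 0.213 | 0.000 | 0.106 | 1.000 | 0.136
SibSp | -0.015 | 0.101 | 0.441 | 0.412 | -0.010 | 0.113 | 0.136 | 1.000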
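
One way to read the two correlation tables together is to look at how much each coefficient moves between the train and test reports. The sketch below is only an illustration: it copies a handful of coefficients from the tables above into plain dictionaries (the names `train_corr`, `test_corr`, and `correlation_shifts` are hypothetical, not part of the report or of ydata-profiling) and ranks the pairs by absolute difference.

```python
# Coefficients copied from the train and test correlation tables above.
# Only a subset of variable pairs is included to keep the sketch short.
train_corr = {
    ("Age", "Fare"): 0.135,
    ("Age", "Pclass"): 0.269,
    ("Age", "Sex"): 0.099,
    ("Fare", "Pclass"): 0.479,
    ("Fare", "SibSp"): 0.447,
    ("Parch", "SibSp"): 0.450,
}
test_corr = {
    ("Age", "Fare"): 0.315,
    ("Age", "Pclass"): 0.349,
    ("Age", "Sex"): 0.000,
    ("Fare", "Pclass"): 0.475,
    ("Fare", "SibSp"): 0.441,
    ("Parch", "SibSp"): 0.412,
}


def correlation_shifts(train, test):
    """Return (pair, train value, test value, |difference|) rows,
    sorted so the largest train/test difference comes first."""
    shared_pairs = train.keys() & test.keys()
    rows = [
        (pair, train[pair], test[pair], round(abs(train[pair] - test[pair]), 3))
        for pair in shared_pairs
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


shifts = correlation_shifts(train_corr, test_corr)
for (left, right), train_value, test_value, diff in shifts:
    print(f"{left:>6} vs {right:<6} train={train_value:+.3f} "
          f"test={test_value:+.3f} |diff|={diff:.3f}")
```

Among the pairs listed here, Age and Fare show the largest shift (0.135 in train against 0.315 in test). Pairs involving Survived cannot be compared this way because the test data has no Survived column.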

Missing values

[Figure, train and test data reports: nullity by column. Matplotlib v3.8.4 SVG images, not recoverable from this text.]
A simple visualization of nullity by column.

[Figure, train and test data reports: nullity matrix. Matplotlib v3.8.4 SVG images, not recoverable from this text.]
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

[Figure, train and test data reports: nullity correlation heatmap. Matplotlib v3.8.4 SVG images, not recoverable from this text.]
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
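
To make "nullity correlation" concrete, the sketch below computes it by hand: each column is turned into a 0/1 missing-value indicator and two indicators are compared with the Pearson coefficient. This is a minimal illustration under assumptions, not the report's own code. The records are hypothetical toy data, `None` stands in for a missing value, and the helper names are invented for this example. With pandas, a comparable result can be obtained with `df.isnull().corr()`.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.
    Returns None when either sequence is constant (undefined correlation)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def missing_counts(rows):
    """Number of missing (None) values per column."""
    return {col: sum(row[col] is None for row in rows) for col in rows[0]}


def nullity_correlation(rows, col_a, col_b):
    """Correlation between the missing-value indicators of two columns."""
    a = [1 if row[col_a] is None else 0 for row in rows]
    b = [1 if row[col_b] is None else 0 for row in rows]
    return pearson(a, b)


# Hypothetical toy records; None marks a missing value.
rows = [
    {"Age": 22.0, "Cabin": None, "Fare": 7.25},
    {"Age": 38.0, "Cabin": "C85", "Fare": 71.28},
    {"Age": None, "Cabin": None, "Fare": 8.46},
    {"Age": 54.0, "Cabin": "E46", "Fare": 51.86},
    {"Age": None, "Cabin": None, "Fare": 8.05},
    {"Age": 2.0, "Cabin": None, "Fare": 21.08},
]

counts = missing_counts(rows)
age_cabin = nullity_correlation(rows, "Age", "Cabin")
age_fare = nullity_correlation(rows, "Age", "Fare")
print(counts)
print(age_cabin, age_fare)
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a fully populated column (Fare in the toy data) has no defined nullity correlation.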

Sample

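Train data report (first 10 rows)

index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S
5 | 6 | 0 | 3 | Moran, Mr. James | male | NaN | 0 | 0 | 330877 | 8.4583 | NaN | Q
6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.0 | 0 | 0 | 17463 | 51.8625 | E46 | S
7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | NaN | S
8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.0 | 0 | 2 | 347742 | 11.1333 | NaN | S
9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.0 | 1 | 0 | 237736 | 30.0708 | NaN | C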

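Test data report (first 10 rows)

index | PassengerId | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
0 | 892 | 3 | Kelly, Mr. James | male | 34.5 | 0 | 0 | 330911 | 7.8292 | NaN | Q
1 | 893 | 3 | Wilkes, Mrs. James (Ellen Needs) | female | 47.0 | 1 | 0 | 363272 | 7.0000 | NaN | S
2 | 894 | 2 | Myles, Mr. Thomas Francis | male | 62.0 | 0 | 0 | 240276 | 9.6875 | NaN | Q
3 | 895 | 3 | Wirz, Mr. Albert | male | 27.0 | 0 | 0 | 315154 | 8.6625 | NaN | S
4 | 896 | 3 | Hirvonen, Mrs. Alexander (Helga E Lindqvist) | female | 22.0 | 1 | 1 | 3101298 | 12.2875 | NaN | S
5 | 897 | 3 | Svensson, Mr. Johan Cervin | male | 14.0 | 0 | 0 | 7538 | 9.2250 | NaN | S
6 | 898 | 3 | Connolly, Miss. Kate | female | 30.0 | 0 | 0 | 330972 | 7.6292 | NaN | Q
7 | 899 | 2 | Caldwell, Mr. Albert Francis | male | 26.0 | 1 | 1 | 248738 | 29.0000 | NaN | S
8 | 900 | 3 | Abrahim, Mrs. Joseph (Sophie Halaut Easu) | female | 18.0 | 0 | 0 | 2657 | 7.2292 | NaN | C
9 | 901 | 3 | Davies, Mr. John Samuel | male | 21.0 | 2 | 0 | A/4 48871 | 24.1500 | NaN | S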

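Train data report (last 10 rows)

index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
881 | 882 | 0 | 3 | Markun, Mr. Johann | male | 33.0 | 0 | 0 | 349257 | 7.8958 | NaN | S
882 | 883 | 0 | 3 | Dahlberg, Miss. Gerda Ulrika | female | 22.0 | 0 | 0 | 7552 | 10.5167 | NaN | S
883 | 884 | 0 | 2 | Banfield, Mr. Frederick James | male | 28.0 | 0 | 0 | C.A./SOTON 34068 | 10.5000 | NaN | S
884 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.0 | 0 | 0 | SOTON/OQ 392076 | 7.0500 | NaN | S
885 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | female | 39.0 | 0 | 5 | 382652 | 29.1250 | NaN | Q
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | NaN | S
887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S
888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.4500 | NaN | S
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.7500 | NaN | Q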

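Test data report (last 10 rows)

index | PassengerId | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
408 | 1300 | 3 | Riordan, Miss. Johanna Hannah"" | female | NaN | 0 | 0 | 334915 | 7.7208 | NaN | Q
409 | 1301 | 3 | Peacock, Miss. Treasteall | female | 3.0 | 1 | 1 | SOTON/O.Q. 3101315 | 13.7750 | NaN | S
410 | 1302 | 3 | Naughton, Miss. Hannah | female | NaN | 0 | 0 | 365237 | 7.7500 | NaN | Q
411 | 1303 | 1 | Minahan, Mrs. William Edward (Lillian E Thorpe) | female | 37.0 | 1 | 0 | 19928 | 90.0000 | C78 | Q
412 | 1304 | 3 | Henriksson, Miss. Jenny Lovisa | female | 28.0 | 0 | 0 | 347086 | 7.7750 | NaN | S
413 | 1305 | 3 | Spector, Mr. Woolf | male | NaN | 0 | 0 | A.5. 3236 | 8.0500 | NaN | S
414 | 1306 | 1 | Oliva y Ocana, Dona. Fermina | female | 39.0 | 0 | 0 | PC 17758 | 108.9000 | C105 | C
415 | 1307 | 3 | Saether, Mr. Simon Sivertsen | male | 38.5 | 0 | 0 | SOTON/O.Q. 3101262 | 7.2500 | NaN | S
416 | 1308 | 3 | Ware, Mr. Frederick | male | NaN | 0 | 0 | 359309 | 8.0500 | NaN | S
417 | 1309 | 3 | Peter, Master. Michael J | male | NaN | 1 | 1 | 2668 | 22.3583 | NaN | C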

Duplicate rows

Train data report

Dataset does not contain duplicate rows.

Test data report

Dataset does not contain duplicate rows.
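
A duplicate-row check of this kind can be sketched in a few lines. The example below is an assumption-laden illustration rather than the profiler's own logic: rows are hypothetical toy tuples, `None` stands in for a missing value, and `duplicate_rows` is a made-up helper that counts how often each fully identical row occurs. With pandas, `df.duplicated().sum()` gives a comparable count of repeated rows.

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences.
    Rows are compared on every column; an empty result means no duplicates."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical toy rows: (Pclass, Sex, Age, Embarked); None marks a missing value.
unique_rows = [
    (3, "male", 22.0, "S"),
    (1, "female", 38.0, "C"),
    (3, "male", None, "Q"),
]
rows_with_repeat = unique_rows + [(1, "female", 38.0, "C")]

no_duplicates = duplicate_rows(unique_rows)
found = duplicate_rows(rows_with_repeat)

print(no_duplicates or "Dataset does not contain duplicate rows.")
print(found)
```

In both reports above every PassengerId is unique, so no two full rows can be identical, which is consistent with the empty duplicate tables.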